@seanmcguire12 seanmcguire12 (Member) commented Aug 9, 2025

why

what changed

  • This PR adds some injected JS which
    1. intercepts Element.prototype.attachShadow early and stashes closed mode shadow roots in a WeakMap
    2. provides a 'backdoor' for safely accessing these closed roots without mutating anything in the actual DOM
  • to access the 'backdoor', this PR adds a custom locator engine selectors.register('stagehand', …)
  • the engine does a depth-first search (DFS) over:
    • regular DOM nodes,
    • open shadow roots via el.shadowRoot,
    • closed roots via window.__stagehand__.getClosedRoot(el)
  • the engine returns a regular Playwright locator
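
A minimal sketch of the interception and traversal described above, written against stand-in types so that it runs outside a browser. `NodeLike`, `RootLike`, `FakeElement`, `installInterceptor`, and `findAll` are hypothetical names used only for illustration; the actual injected script and selector engine may differ, and the real engine returns a Playwright locator rather than an array.

```typescript
// Stand-in types (assumptions): just enough structure to model hosts and shadow roots.
type ShadowInit = { mode: "open" | "closed" };
interface RootLike { children: NodeLike[]; }
interface NodeLike extends RootLike { tag: string; shadowRoot: RootLike | null; }
type AttachShadow = (this: NodeLike, init: ShadowInit) => RootLike;

// Fake element that mimics the DOM rule: closed roots are not exposed via shadowRoot.
class FakeElement implements NodeLike {
  tag: string;
  children: NodeLike[] = [];
  shadowRoot: RootLike | null = null;
  constructor(tag: string) { this.tag = tag; }
  attachShadow(init: ShadowInit): RootLike {
    const root: RootLike = { children: [] };
    if (init.mode === "open") this.shadowRoot = root;
    return root;
  }
}

// Wrap attachShadow so closed roots are stashed in a WeakMap, leaving the DOM untouched.
function installInterceptor(proto: { attachShadow: AttachShadow }) {
  const closedRoots = new WeakMap<NodeLike, RootLike>();
  const original = proto.attachShadow;
  proto.attachShadow = function (this: NodeLike, init: ShadowInit): RootLike {
    const root = original.call(this, init);
    if (init.mode === "closed") closedRoots.set(this, root);
    return root;
  };
  return { getClosedRoot: (el: NodeLike) => closedRoots.get(el) };
}

// Depth-first search over light DOM children, open roots, and stashed closed roots.
function findAll(
  root: RootLike,
  tag: string,
  getClosedRoot: (el: NodeLike) => RootLike | undefined,
  out: NodeLike[] = [],
): NodeLike[] {
  for (const el of root.children) {
    if (el.tag === tag) out.push(el);
    const shadow = el.shadowRoot ?? getClosedRoot(el);
    if (shadow) findAll(shadow, tag, getClosedRoot, out);
    findAll(el, tag, getClosedRoot, out);
  }
  return out;
}

// Usage: a button inside a closed root is still reachable through the stash.
const shadowApi = installInterceptor(FakeElement.prototype);
const shadowHost = new FakeElement("x-host");
const closedRoot = shadowHost.attachShadow({ mode: "closed" });
closedRoot.children.push(new FakeElement("button"));
const fakeDocument: RootLike = { children: [shadowHost] };
const foundButtons = findAll(fakeDocument, "button", shadowApi.getClosedRoot);
```

The WeakMap keeps no strong reference to host elements, so stashing roots this way should not prevent detached hosts from being garbage collected.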

note

  • all the logic here is behind the experimental flag in the stagehand constructor, so that we can give people access without breaking existing behaviour
  • this means that this feature is not available on the API (yet), and you'll need to set experimental: true in order to use it

test plan

  • added 8 different evals. They are primarily aimed at testing shadow dom interactions with the various types of shadow DOMs (open & closed mode) and iframes (OOPIFs & SPIFs)
  • will also run:
    • regression evals
    • act evals
    • extract evals
    • observe evals


changeset-bot bot commented Aug 9, 2025

🦋 Changeset detected

Latest commit: 2dad35f

The changes in this PR will be included in the next version bump.

@seanmcguire12 seanmcguire12 force-pushed the sean/stg-659-add-support-for-shadow-doms branch from bd4a3ab to be7f776 Compare August 12, 2025 00:19
@seanmcguire12 seanmcguire12 marked this pull request as ready for review August 12, 2025 01:22
@seanmcguire12 seanmcguire12 added the act, extract, observe, and targeted-extract labels Aug 12, 2025
Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Summary

This PR introduces comprehensive Shadow DOM support to Stagehand, enabling interaction with elements inside both open and closed shadow roots. The implementation addresses multiple user-reported issues where Stagehand previously returned 'not-supported' errors for shadow DOM elements.

The solution uses a multi-layered approach: (1) JavaScript injection that intercepts Element.prototype.attachShadow early in the page lifecycle to capture closed shadow roots in a WeakMap, (2) a global window.__stagehand__ backdoor API providing safe access to closed shadow roots without DOM mutations, (3) a custom Playwright selector engine 'stagehand' that performs depth-first search traversal across regular DOM nodes, open shadow roots via el.shadowRoot, and closed roots via the backdoor API.

The feature integrates throughout the codebase by adding experimental flag support to all handlers (ActHandler, ExtractHandler, ObserveHandler), modifying XPath generation to include shadow root markers using '//' syntax, enhancing accessibility tree building to traverse shadow boundaries, and adding specialized error classes for shadow DOM failures. Eight comprehensive evaluation tests validate different combinations of shadow DOM modes (open/closed) with iframe contexts (OOPIF/SPIF).

The implementation is gated behind the experimental: true flag in the Stagehand constructor to prevent breaking existing behavior and is not yet available on the API. This architectural choice allows users to opt into the enhanced functionality while maintaining backward compatibility for production environments.

Confidence score: 3/5

  • This PR introduces significant complexity with experimental shadow DOM support that could cause issues if not thoroughly tested in production scenarios
  • Score reflects the experimental nature of the feature and potential edge cases in shadow DOM traversal, especially with the global window object modification and WeakMap approach
  • Pay close attention to lib/StagehandPage.ts where shadow DOM detection logic may incorrectly identify elements, and evaluate files that lack proper return statements for failure cases

17 files reviewed, 10 comments

@seanmcguire12 seanmcguire12 removed the act, extract, observe, and targeted-extract labels Aug 14, 2025
@seanmcguire12 seanmcguire12 merged commit 261bba4 into main Aug 14, 2025
19 of 30 checks passed
@github-actions github-actions bot mentioned this pull request Aug 14, 2025
miguelg719 pushed a commit that referenced this pull request Aug 19, 2025
This PR was opened by the [Changesets
release](https://github.com/changesets/action) GitHub action. When
you're ready to do a release, you can merge this and the packages will
be published to npm automatically. If you're not ready to do a release
yet, that's fine, whenever you add more changesets to main, this PR will
be updated.


# Releases
## @browserbasehq/[email protected]

### Patch Changes

- [#951](#951)
[`f45afdc`](f45afdc)
Thanks [@miguelg719](https://github.com/miguelg719)! - Patch GPT-5 new
api format

- [#954](#954)
[`261bba4`](261bba4)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - add support
for shadow DOMs (open & closed mode) when experimental: true

- [#944](#944)
[`8de7bd8`](8de7bd8)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - Bump zod
version compatibility and add pathing spec

- [#919](#919)
[`3d80421`](3d80421)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - enable
scrolling inside of iframes

- [#963](#963)
[`0ead63d`](0ead63d)
Thanks [@tkattkat](https://github.com/tkattkat)! - Properly handle
images in evaluator + clean up response parsing logic

- [#961](#961)
[`8422828`](8422828)
Thanks [@tkattkat](https://github.com/tkattkat)! - Add more evals for
stagehand agent

- [#946](#946)
[`b769206`](b769206)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: unable
to act on/get content from some same process iframes

- [#962](#962)
[`72d2683`](72d2683)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - handle
namespaced elements in xpath build step

## @browserbasehq/[email protected]

### Patch Changes

- Updated dependencies
\[[`f45afdc`](f45afdc),
[`261bba4`](261bba4),
[`8de7bd8`](8de7bd8),
[`3d80421`](3d80421),
[`0ead63d`](0ead63d),
[`8422828`](8422828),
[`b769206`](b769206),
[`72d2683`](72d2683)]:
    -   @browserbasehq/[email protected]

## @browserbasehq/[email protected]

### Patch Changes

- Updated dependencies
\[[`f45afdc`](f45afdc),
[`261bba4`](261bba4),
[`8de7bd8`](8de7bd8),
[`3d80421`](3d80421),
[`0ead63d`](0ead63d),
[`8422828`](8422828),
[`b769206`](b769206),
[`72d2683`](72d2683)]:
    -   @browserbasehq/[email protected]

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@seanmcguire12 seanmcguire12 mentioned this pull request Aug 21, 2025
seanmcguire12 pushed a commit that referenced this pull request Aug 27, 2025
…tration (#1022)

# why

This is to fix some undesired behavior for a common dev workflow with
`next dev`. Introduced in
#954.

There is now module-level state in `lib/StagehandPage.ts` (the
`stagehandSelectorRegistered` boolean) used to guard against multiple
calls to `selectors.register` (a Playwright function which sets
module-level state).

This boolean is used in the function `ensureStagehandSelectorEngine`. The
guard exists because calling `selectors.register` with the same string more
than once will cause an error.

The problem is that `next dev` repeatedly reloads the `stagehand` module
whenever we first start up our dev server or make changes, but without
always reloading the underlying `playwright` module.

<details>
<summary>So, we get lots of errors like this.</summary>

```zsh
[2025-08-22 12:58:10] web:dev: Error in Inngest task {
[2025-08-22 12:58:10] web:dev:   error: {
[2025-08-22 12:58:10] web:dev:     error: 'NonRetriableError',
[2025-08-22 12:58:10] web:dev:     message: "Hey! We're sorry you ran into an error. \n" +
[2025-08-22 12:58:10] web:dev:       'Stagehand version: 2.4.3 \n' +
[2025-08-22 12:58:10] web:dev:       'If you need help, please open a Github issue or reach out to us on Slack: https://stagehand.dev/slack\n' +
[2025-08-22 12:58:10] web:dev:       '\n' +
[2025-08-22 12:58:10] web:dev:       'Full error:\n' +
[2025-08-22 12:58:10] web:dev:       'selectors.register: "stagehand" selector engine has been already registered',
[2025-08-22 12:58:10] web:dev:     name: 'Error',
[2025-08-22 12:58:10] web:dev:     stack: 'StagehandDefaultError: \n' +
[2025-08-22 12:58:10] web:dev:       "Hey! We're sorry you ran into an error. \n" +
[2025-08-22 12:58:10] web:dev:       'Stagehand version: 2.4.3 \n' +
[2025-08-22 12:58:10] web:dev:       'If you need help, please open a Github issue or reach out to us on Slack: https://stagehand.dev/slack\n' +
[2025-08-22 12:58:10] web:dev:       '\n' +
[2025-08-22 12:58:10] web:dev:       'Full error:\n' +
[2025-08-22 12:58:10] web:dev:       'selectors.register: "stagehand" selector engine has been already registered\n' +
[2025-08-22 12:58:10] web:dev:       '    at _StagehandPage.eval (webpack-internal:///(rsc)/../../node_modules/.pnpm/@[email protected][email protected][email protected][email protected][email protected]/node_modules/@browserbasehq/stagehand/dist/index.js:4077:15)\n' +
[2025-08-22 12:58:10] web:dev:       '    at Generator.throw (<anonymous>)\n' +
[2025-08-22 12:58:10] web:dev:       '    at rejected (webpack-internal:///(rsc)/../../node_modules/.pnpm/@[email protected][email protected][email protected][email protected][email protected]/node_modules/@browserbasehq/stagehand/dist/index.js:73:29)'
[2025-08-22 12:58:10] web:dev:   },
```
</details>

**TL;DR: the `stagehand` module-level guard, which exists to protect
Playwright's module-level state, falls out of sync with Playwright.**
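
A minimal simulation of that failure mode, assuming a process-wide registry that survives module reloads and a guard flag that does not. `registeredEngines`, `registerEngine`, and `loadStagehandModule` are hypothetical stand-ins, not Playwright or Stagehand APIs; only the names `stagehandSelectorRegistered` and `ensureStagehandSelectorEngine` come from the description above.

```typescript
// Stand-in for Playwright's selector registry: state that outlives stagehand reloads.
const registeredEngines = new Set<string>();
function registerEngine(name: string): void {
  if (registeredEngines.has(name)) {
    throw new Error(`selectors.register: "${name}" selector engine has been already registered`);
  }
  registeredEngines.add(name);
}

// Each call models one evaluation of the stagehand module, so the guard starts at false.
function loadStagehandModule(): () => void {
  let stagehandSelectorRegistered = false;
  return function ensureStagehandSelectorEngine(): void {
    if (stagehandSelectorRegistered) return;
    registerEngine("stagehand");
    stagehandSelectorRegistered = true;
  };
}

// First load: the guard works, repeated calls are harmless.
const ensureOnFirstLoad = loadStagehandModule();
ensureOnFirstLoad();
ensureOnFirstLoad();

// Simulated reload: the guard is reset, but the registry still holds "stagehand".
const ensureAfterReload = loadStagehandModule();
let reloadError = "";
try {
  ensureAfterReload();
} catch (err) {
  reloadError = err instanceof Error ? err.message : String(err);
}
```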

This is not really `stagehand`'s "fault". It appears to be
`next`-specific behavior combined with some logic to get around funky
module-level `playwright` state. But it is causing a lot of friction on
our team; I think module-level state is risky in general for this
reason.

# what changed

My proposed fix is to wrap this `selectors.register` call in a specific
`try`/`catch` that looks for, and ignores, the specific error matching
`/selector engine has been already registered/` (thrown from
`packages/playwright-core/src/client/selectors.ts`), instead of using the
`stagehandSelectorRegistered` boolean.
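
A hedged sketch of that approach, not the code from the actual change. `registerIgnoringDuplicate` and `fakeRegister` are hypothetical names, and `fakeRegister` is a stand-in that only imitates the duplicate-name error so the example runs without Playwright installed.

```typescript
// Stand-in for selectors.register (assumption): rejects on a duplicate engine name.
const knownEngines = new Set<string>();
async function fakeRegister(name: string): Promise<void> {
  if (knownEngines.has(name)) {
    throw new Error(`selectors.register: "${name}" selector engine has been already registered`);
  }
  knownEngines.add(name);
}

// Attempt registration every time; swallow only the "already registered" error.
async function registerIgnoringDuplicate(
  name: string,
  registerFn: (name: string) => Promise<void>,
): Promise<void> {
  try {
    await registerFn(name);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    if (!/selector engine has been already registered/.test(message)) {
      throw err; // unrelated failures still propagate
    }
  }
}
```

Because the check relies on the registry's own error rather than a local flag, there is no second piece of state that can drift out of sync. The trade-off is a dependency on the wording of the error message.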

# test plan

Existing evals.

This also works locally as expected when I build our system against this
version: the error no longer appears, no matter how many times the module
is reloaded.